-
Notifications
You must be signed in to change notification settings - Fork 25.5k
Add support for multi-value dimensions #112645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for multi-value dimensions #112645
Conversation
Closes elastic#110387 Having this in now affords us not having to introduce version checks in the ES exporter later.
Documentation preview: |
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
Hi @felixbarny, I've created a changelog YAML for you. |
modules/data-streams/src/yamlRestTest/resources/rest-api-spec/test/data_stream/150_tsdb.yml
Outdated
Show resolved
Hide resolved
modules/data-streams/src/yamlRestTest/resources/rest-api-spec/test/data_stream/150_tsdb.yml
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
I'm still on the fence for adding IP fields to the tsid, it does seem wasteful.. I see that they're part of the ECS template, but this will likely penalize indexing performance. Do we believe there's value in doing so? @martijnvg fyi. |
rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/tsdb/140_routing_path.yml
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general LGTM.
if (values.size() == 1) { | ||
// converts the immutable list that's optimized for the common case of having only one value to a mutable list | ||
BytesReference previousValue = values.get(0); | ||
values = new ArrayList<>(4); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the multi valued case, do you often expect 4 ip values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is assuming that when there are multiple values, we'll likely have less than 10 (the default size for ArrayLists), but probably more than 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tried to understand the code change, but I built an image from this PR and it works with our otel setup.
Indexed document in ES with an array dimension named "arraydimension"
{
"_index": ".ds-metrics-generic.otel-default-2024.09.13-000001",
"_id": "cEnVhVfwBBT5RW1zAAABkezcIVI",
"_version": 1,
"_score": 0,
"_source": {
"@timestamp": "2024-09-13T19:28:55.122Z",
"attributes": {
"arraydimension": [
"bar",
"foo"
],
"event.outcome": "success",
"metricset.interval": "1m",
"metricset.name": "transaction",
"processor.event": "metric",
"transaction.name": "parent",
"transaction.result": "Success",
"transaction.root": true,
"transaction.type": "unknown"
},
"data_stream": {
"dataset": "generic.otel",
"namespace": "default",
"type": "metrics"
},
"metrics": {
"transaction.duration.histogram": {
"values": [
15000
],
"counts": [
1
]
}
},
"resource": {
"attributes": {
"agent.name": "opentelemetry/go",
"agent.version": "1.28.0",
"metricset.interval": "1m",
"service.name": "sendotlp",
"some.resource.attribute": "resource.attr",
"telemetry.sdk.language": "go",
"telemetry.sdk.name": "opentelemetry",
"telemetry.sdk.version": "1.28.0"
},
"dropped_attributes_count": 0
},
"scope": {
"dropped_attributes_count": 0,
"name": "otelcol/spanmetricsconnectorv2"
}
},
"fields": {
"resource.attributes.agent.version": [
"1.28.0"
],
"scope.name": [
"otelcol/spanmetricsconnectorv2"
],
"resource.attributes.telemetry.sdk.name": [
"opentelemetry"
],
"resource.attributes.agent.name": [
"opentelemetry/go"
],
"resource.attributes.telemetry.sdk.language": [
"go"
],
"scope.dropped_attributes_count": [
0
],
"service.language.name": [
"go"
],
"transaction.result": [
"Success"
],
"telemetry.sdk.name": [
"opentelemetry"
],
"attributes.transaction.result": [
"Success"
],
"telemetry.sdk.language": [
"go"
],
"transaction.root": [
true
],
"processor.event": [
"metric"
],
"resource.attributes.metricset.interval": [
"1m"
],
"agent.name": [
"opentelemetry/go"
],
"telemetry.sdk.version": [
"1.28.0"
],
"attributes.metricset.name": [
"transaction"
],
"metrics.transaction.duration.histogram": [
{
"values": [
15000
],
"counts": [
1
]
}
],
"attributes.transaction.root": [
true
],
"event.outcome": [
"success"
],
"service.name": [
"sendotlp"
],
"attributes.arraydimension": [
"bar",
"foo"
],
"data_stream.namespace": [
"default"
],
"attributes.transaction.name": [
"parent"
],
"resource.attributes.some.resource.attribute": [
"resource.attr"
],
"some.resource.attribute": [
"resource.attr"
],
"metricset.interval": [
"1m"
],
"data_stream.type": [
"metrics"
],
"transaction.duration.histogram": [
{
"values": [
15000
],
"counts": [
1
]
}
],
"transaction.type": [
"unknown"
],
"metricset.name": [
"transaction"
],
"attributes.processor.event": [
"metric"
],
"@timestamp": [
"2024-09-13T19:28:55.122Z"
],
"resource.attributes.telemetry.sdk.version": [
"1.28.0"
],
"data_stream.dataset": [
"generic.otel"
],
"attributes.event.outcome": [
"success"
],
"attributes.transaction.type": [
"unknown"
],
"agent.version": [
"1.28.0"
],
"arraydimension": [
"bar",
"foo"
],
"attributes.metricset.interval": [
"1m"
],
"transaction.name": [
"parent"
],
"resource.dropped_attributes_count": [
0
],
"resource.attributes.service.name": [
"sendotlp"
]
}
}
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
Closes elastic#110387 Having this in now affords us not having to introduce version checks in the ES exporter later. We can simply use the same serialization logic for metric attributes as we do for other signals. This also enables us to properly map `*.ip` fields to the ip field type as ip fields containing a list of IPs are not converted to a comma-separated list. (cherry picked from commit 8d223cb) # Conflicts: # server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
* Add support for multi-value dimensions (#112645) Closes #110387 Having this in now affords us not having to introduce version checks in the ES exporter later. We can simply use the same serialization logic for metric attributes as we do for other signals. This also enables us to properly map `*.ip` fields to the ip field type as ip fields containing a list of IPs are not converted to a comma-separated list. (cherry picked from commit 8d223cb) # Conflicts: # server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java * Remove skip test for 8.x This was just needed for 8.x to 9.0 compatibility tests
@carsonip this has been merged and back ported to 8.x now. Could you take on the work to remove the workarounds from the ES exporter and the spec? |
The es exporter PR is ready to go: open-telemetry/opentelemetry-collector-contrib#35291 |
Closes #110387
Having this in now affords us not having to introduce version checks in the ES exporter later. We can simply use the same serialization logic for metric attributes as we do for other signals.
This also enables us to properly map
*.ip
fields to the ip field type as ip fields containing a list of IPs are not converted to a comma-separated list.